So good morning everyone.
This is the second demo session of today's computer vision lecture.
As you might remember, last time we covered traditional methods, and now we are focusing especially on learning-based methods.
And I also think that the advertisement you heard before is a hint at how important computer vision is nowadays.
In this lecture you will also see possible methods for solving the problems that were mentioned in the advertisement.
So this session is split into two parts.
First of all, we are examining 2D scene understanding, or in general 2D computer vision: how we can gain knowledge from 2D data.
In the next section of this lecture we will focus more on how we can actually gain a 3D understanding just on the basis of 2D data.
I will first start with a little introduction for those who have never seen anything in connection with deep learning methods, and I will give a brief overview of the common tasks in the field of 2D computer vision.
In general, these are pretty much the state-of-the-art tasks that we have in 2D computer vision.
You might have heard about semantic segmentation, where the goal is to segment the scene you are seeing just on the basis of a given image.
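As a rough illustration of what such a segmentation model looks like in code, here is a minimal sketch using a pretrained torchvision model; the library choice, the model (fcn_resnet50), and the file name scene.jpg are assumptions for this example, not necessarily the setup used in the demo.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Hypothetical input image (assumption for this sketch).
img = Image.open("scene.jpg").convert("RGB")

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
batch = preprocess(img).unsqueeze(0)  # shape: (1, 3, H, W)

# Pretrained semantic segmentation model from torchvision (one possible choice).
model = models.segmentation.fcn_resnet50(pretrained=True).eval()
with torch.no_grad():
    out = model(batch)["out"]          # (1, num_classes, H, W) per-pixel class scores
pred = out.argmax(dim=1)               # (1, H, W): one category label per pixel
```

The argmax over the class dimension yields exactly the dense per-pixel labeling that semantic segmentation is about: every pixel of the scene is assigned a category, but instances of the same category are not separated.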
You might also have heard about classification, where the goal, when you have one single object in the image, is to classify which category this object belongs to.
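For comparison, a single-object classifier can be sketched in just a few lines; again, this is only a hedged example with an assumed pretrained ResNet-50 and a hypothetical image file, not the model from the lecture.

```python
import torch
from torchvision import models, transforms
from PIL import Image

img = Image.open("cat.jpg").convert("RGB")  # hypothetical single-object image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
batch = preprocess(img).unsqueeze(0)

model = models.resnet50(pretrained=True).eval()
with torch.no_grad():
    logits = model(batch)                 # (1, 1000) ImageNet class scores
pred_class = logits.argmax(dim=1).item()  # index of the predicted category
```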
And you also might have heard about the combination of classification with localization, still with a focus on one single object: you classify the object and you also try to estimate the region it lies in, for example in the form of a 2D bounding box.
Now we are getting to the more interesting parts, because things get a little more complicated as soon as we have multiple objects: then we also need to distinguish those objects from each other.
We start with object detection, where the goal is basically the same as in the single-object case: we want to classify the objects we are seeing and we also want to estimate the regions where they are located.
This combination, applied to multiple objects, is called object detection.
This is already a well-researched but still very interesting research direction, and a lot of people are working on improving learning methods for object detection.
In this example we have multiple animals that need to be detected, and the network, or rather the learning approach you are using, also outputs the class of each object it is seeing.
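To make the detection output concrete, here is a small sketch of how one might run a pretrained detector; Faster R-CNN from torchvision and the image name animals.jpg are assumptions for illustration and not necessarily what is used in the demo. Note that the same call also covers the single-object classification plus localization case, it simply returns one box instead of many.

```python
import torch
from torchvision import models, transforms
from PIL import Image

img = Image.open("animals.jpg").convert("RGB")   # hypothetical multi-object image
tensor = transforms.ToTensor()(img)              # detection models take 0-1 tensors

# Pretrained object detector from torchvision (one possible choice).
model = models.detection.fasterrcnn_resnet50_fpn(pretrained=True).eval()
with torch.no_grad():
    outputs = model([tensor])[0]   # one dict per input image

# Each detection has a class label, a confidence score, and a 2D bounding box.
for box, label, score in zip(outputs["boxes"], outputs["labels"], outputs["scores"]):
    if score > 0.5:                # keep reasonably confident detections
        print(label.item(), score.item(), box.tolist())
```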
And lastly we have something that is called instance segmentation.
Why is it called instance segmentation and not just semantic segmentation, as we heard before?
The crucial part about instance segmentation is that when we are seeing multiple objects, there could be multiple instances of the same category, for example multiple cats.
So we have to distinguish those cats from each other in some way.
The network is actually assigning them to separate instances, and they are treated as completely different objects during the learning process.
And as we already saw before, each of those objects will also get a label.
So we have those two cats here.
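Here is a hedged sketch of how instance segmentation output differs from semantic segmentation, using a pretrained Mask R-CNN from torchvision as an assumed stand-in for the method shown in the demo; the image name two_cats.jpg is hypothetical.

```python
import torch
from torchvision import models, transforms
from PIL import Image

img = Image.open("two_cats.jpg").convert("RGB")  # hypothetical image with two cats
tensor = transforms.ToTensor()(img)

# Pretrained instance segmentation model from torchvision (one possible choice).
model = models.detection.maskrcnn_resnet50_fpn(pretrained=True).eval()
with torch.no_grad():
    outputs = model([tensor])[0]

# Unlike semantic segmentation, each detected instance gets its own mask,
# so two cats of the same category are still kept apart.
for i, (label, score, mask) in enumerate(
        zip(outputs["labels"], outputs["scores"], outputs["masks"])):
    if score > 0.5:
        binary_mask = mask[0] > 0.5   # per-pixel mask for this single instance
        print(f"instance {i}: class {label.item()}, pixels {int(binary_mask.sum())}")
```

The key difference to the semantic segmentation sketch above is that the two cats end up as two separate masks with their own labels and scores, rather than being merged into one "cat" region.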
In this session, Vanessa Wirth shows live demos of learning-based methods for object detection and semantic instance segmentation. She also presents CAD model alignment in indoor scenes, which uses BundleFusion (Dai et al., 2017: https://dl.acm.org/doi/abs/10.1145/3072959.3054739), developed at FAU.
Vanessa is currently pursuing her PhD at the Chair of Visual Computing, Department of Computer Science. She has open theses/projects related to 3D reconstruction of objects and human bodies: https://www.lgdv.tf.fau.de/person/vanessa-wirth/